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ABSTRACT 

We describe an algorithm for the extraction of the angular power spectrum of an 
intensity field, such as the cosmic microwave background (CMB), from interferometer 
data. This new method, based on the gridding of interferometer visibilities in the 
aperture plane followed by a maximum likelihood solution for bandpowers, is much faster 
than direct likelihood analysis of the visibilities, and deals with foreground radio sources, 
multiple pointings, and differencing. The gridded aperture-plane estimators are also 
used to construct Wiener-filtered images using the signal and noise covariance matrices 
used in the likelihood analysis. Results are shown for simulated data. The method 
has been used to determine the power spectrum of the cosmic microwave background 
from observations with the Cosmic Background Imager, and the results are given in 
companion papers. 

Subject headings: cosmic microwave background — cosmology: data analysis 

1. Introduction 

The technique of interferometry has been widely used in radio astronomy to image the sky 
using arrays of antennas. By correlating the complex voltage signals between pairs of antennas, 
the field-of-view of a single element can be sub-divided into "synthesized beams" of higher angular 
resolution. In the small-angle approximation, the interferometer forms the Fourier transform of 
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the sky convolved with the autocorrelation of the aperture voltage patterns. In standard radio 
interferometric data analysis, as described for example in the text by Thompson, Moran, & Swcnson 
(1986) and the proceedings of the NRAO Synthesis Imaging School (Taylor, Carilli, &: Pcrlcy 1999), 
the correlations or visibilities are inverse Fourier transformed back to the image plane. However, 
there are applications such as estimation of the angular power spectrum of fluctuations in the cosmic 
microwave background (CMB) where it is the distribution of and correlation between visibilities in 
the aperture or -uv-plane that is of most interest. 

In standard cosmological models, the CMB is assumed to be a statistically homogeneous Gaus- 
sian random field (Bond k, Efstathiou 1987). In this case, the spherical harmonics of the field are 
independent and the statistical properties are determined by the power spectrum where i labels 
the component of the Legendre polynomial expansion (and is roughly in inverse radians). Bond & 
Efstathiou (1987) showed that in cold dark matter inspired cosmological models, there would be 
features in the CMB power spectrum that reflected critical properties of the cosmology. Recent 
detections of the first few of these "acoustic peaks" at ^ < 1000 in the power spectrum (Lange et 
al. 2001; Hanany et al. 2001; Lee et al. 2001; Halverson et al. 2002; Netterfleld et al. 2001) have 
supported the standard inflationary cosmological model with Vltot ~ 1 • Measurement of the higher- 
t peaks and troughs, as well as the damping tail due to the finite thickness of the last scattering 
surface, is the next observational step. Interferometers are well-suited to the challenge of mapping 
out features in the CMB power spectrum, with a given antenna pair probing a characteristic i 
proportional to the baseline length in units of the observing wavelength (a lOOA projected baseline 
corresponds to ^ ~ 628, see §3). 

There are many papers in the literature on the analysis of CMB anisotropy measurements, 
estimation of power spectra, and the use of interferometry for CMB studies. General issues for 
analysis of CMB datasets are discussed in Bond, Jaffe, & Knox (1998, 2000). Hobson, Lasenby, & 
Jones (1995) present a Bayesian method for the analysis of CMB interferometer data, using the 
3-element Cosmic Anisotropy Telescope data as a test case. A description of analysis techniques for 
interferometric observations from the Degree Angular Scale Interferometer (DASI) were presented 
in White et al. (1999a,b), while Halverson et al. (2002) report on the power spectrum results from 
first-season of DASI observations. Ng (2001) discusses CMB interferometry with application to the 
proposed AMIBA instrument. Hobson & Maisinger (2002) have recently presented an approach 
similar to ours, and demonstrate their technique on simulated Very Small Array (VSA) data; a 
brief comparison of their algorithm with ours is given in Appendix C. 

In this paper, we describe a fast gridded method for the nw-plane analysis of large interferomet- 
ric data sets. The basis of this approach is to grid the visibilities and perform maximum likelihood 
estimation of the power spectrum on this compressed data. Our use of gridded estimators is sig- 
nificantly different from what has been done previously. In addition to power spectrum extraction, 
this procedure has the ability to form optimally filtered images from the gridded estimators, and 
may be of use in interferometric observations of radio sources in general. 
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We have applied our method to the analysis of data from the Cosmic Background Imager 
(CBI). The CBI is a planar interferometer array of 13 individual 90-cm Cassegrain antennas on a 6- 
m pointable platform (Padin et al. 2002). It covers the frequency range 26-36 GHz in 10 contiguous 
1 GHz channels, with a thermal noise level of 2 /iK in 6 hours, and a maximum resolution of 4' limited 
by the longest baselines. The CBI baselines probe £ in the range 500-3900. The 90-cm antenna 
diameters were chosen to maximize sensitivity, but their primary beamwidth of 45^2 (FWHM) at 
31 GHz limits the instantaneous field of view, which in turn limits the resolution in i. This loss 
of aperture plane resolution can be overcome by making mosaic observations, i.e. observations in 
which a number of adjacent pointings are combined (Ekers & Rots 1979; Cornwell 1988; Cornwell, 
Holdaway, & Uson 1993; Sault, Staveley-Smith, & Brouw 1996). In the CBI observations, mosaicing 
a field several times larger than the primary beam has resulted in an increase in resolution in £ by 
more than a factor of 3, sufficient to discern features in the power spectrum. 

The first CBI results were presented in Padin et al. (2001), hereafter Paper I, using earlier 
versions of the software that did not make use of ut^-plane gridding, and were far too slow to be 
used on larger mosaiced data sets. It was therefore essential to develop a more efficient analysis 
method that would be fast enough to carry out extensive tests on the CBI mosaic data. The software 
package described below has been used to process the first year of CBI data. In the companion 
papers (Mason et al. 2002, hereafter Paper II) and (Pearson et al. 2002, hereafter Paper III), 
the results from passing CBI deep-field data and mosaic data respectively through the pipeline are 
presented. This paper is Paper IV in the series. The output from this pipeline is then used to derive 
constraints on cosmology (Sicvers et al. 2002, hereafter Paper V). Finally, analysis of the excess of 
power at high-^ seen in results shown in Paper II in the context of the Sunyaev-Zcldovich effect is 
carried out, again using the method presented here, in (Bond et al. 2002, hereafter Paper VI). 

An introduction to the properties of the CMB power spectrum, the response of an interfer- 
ometer to the incoming radiation, and the computation of the primary beam arc given in sections 
§2, §3, and §4 respectively. The gridding process is presented in §5, followed by a description of 
the likelihood function and construction of the various covariance matrices in §6. Details on the 
maximum likelihood solution and the calculation of window functions and component bandpowers 
is given in §7, while §8 presents our method for making optimally filtered images from the gridded 
estimators. Finally, a description of the CBI implementation of this method and the performance 
of the pipeline, including demonstrations using simulated CBI datasets, is given in §9, followed by 
a summary and conclusions in §10. 



2. The CMB Power Spectrum 



At small angles, the curvature of the sky is negligible and we can approximate the spherical 
harmonic transform of the the temperature field T{x) in direction x as its Fourier transform T{u) 
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(Bond &; Efstathiou 1987), where u is the conjugate variable to x. We adopt the Fourier convention 
F{u)= f (fxF{x)e-'^''''^-'^ ^ F{^)= f <fuF{u)e^''''^-'' (1) 



of Bracewell (1986), Thompson, Moran, & Swenson (1986), and Taylor, Carilh, & Perley (1999). 
In terms of the multipoles 

'f{uf\K,Ce, £ + l/2«27r|M| (2) 



which we simplify to ^ = 27r|w| for the i > 100 of interest in this paper. For the low levels anisotropy 
seen in the CMB on these scales, it is convenient to give T in units of /xK and thus Q will be in 
units of /xK^. 

Because the CMB is assumed to be a statistically homogeneous Gaussian random field, the 
components of its Fourier transform are independent Gaussian deviates. 

f{u) f*{u')) = C{\u\) S'^iu - u') (3) 



where C(|it|) = C2-n\u\- Because T{x) is real, its transform must be Hermitian, with T{u) = 
T*{—u), and therefore 

[f{u)f{u')) = (f{u)f*{-u')) = C{\u\)SHu + u'). (4) 
Note that it is common to write the CMB power spectrum Cp in a form 

^ C(\u\)^ 271 \u\^Ci\u\) (5) 

(White et al. 1999a; Bond, Jaffe, & Knox 1998, 2000). Constant C corresponds to equal power in 
equal intervals of log £. 

Although the power spectrum Cp is defined in units of brightness temperature, the interfer- 
ometer measurements carry the units of fiux density S'jy (Janskys, 1 Jy = 10^^^ W m^^ Hz^^). In 
particular, the intensity field on the sky Iu{x) has units of specific intensity (W m~^ Hz~^ sr~^ or 
Jy/sr), and thus to convert between 7j, and T we use Iv{x) = f^ii^) T{x) with the Planck factor 

2 X 

h{u) = 2u-'kBg{v,To)/c? g(^u,T^) = x = hu/kj,To (6) 

where g corrects for the blackbody spectrum. Note that We have treated the temperature T as 
small fiuctuations about the mean CMB temperature To = 2.725 K (Mather et al. 1999), and thus 
the g appropriate to Tq is used with g ^ 0.98 at u = 31 GHz. 

We are not restricted to modeling the CMB. For example, we might wish to determine the 
power spectrum of fluctuations in a diflFuse galactic component such as synchrotron emission or 
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thermal dust emission. In this case, one might wish to express / in Jy/sr, but take out a power-law 
spectral shape 

where a is the spectral index, and /o(z^) is the conversion factor that normalizes to the intensity 
Iq at the fiducial frequency uq. Note that this normalization is particularly useful for fitting out 

centimeter-wave foreground emission, which tends to have a power-law spectral index in the range 
— 1 < a < 1 that is significantly difi"erent from that for the thermal CMB (a ~ 2). In addition, 
foregrounds will also tend to have a power spectrum shape difi:erent from that of CMB, which must 
be included in the analysis (see §6.4 below). 



3. Response of the Interferometer 

A visibility Vk formed from the correlation of an interferometer baseline between two antennas 
with projected separation (in the plane perpendicular to the source direction) 6 meters observed at 
wavelength A meters measures (in the absence of noise) the Fourier transform of the sky intensity 
modulated by the response of the antennas (Thompson, Moran, k, Swenson 1986) 

V{u) = J d^xA{x) I{x) e-'^-'i^-^ u = b/X (8) 

where ^(a;) is the primary beam, and u = (u, v) is the conjugate variable to x. For angular 
coordinates x in radians, u has the dimensions of the baseline or aperture in units of the wavelength. 
The Fourier domain is also referred to as the uv-p\ane or aperture plane in interferometry for this 
reason. 

We define the direction cosines 

Xk = {Axk,Ayk) Axk = cos 6k s'm{ak - oq) 

Ayk = sin6k cos(5o — cosSk sinc^o cos{ak — ao) (9) 

between the point at right ascension and declination a/., and the center of the mosaic aQ,6Q. For 
the CBI, data are taken keeping the phase center fixed on the pointing center x^ by shifting the 
phase with the beam and rotating the platform to maintain constant parallactic angle during a 
scan, so that the response to a point source at the center of the field I{x) = S^{x — Xk) is constant, 
and thus 

A{x) = Ak{x-Xk)e^^'^^-''^ (10) 

in equation (8), where A^ is the normalized primary beam response at the observing frequency of 
visibility k. Then, by application of the Fourier shift theorem, it is easy to show that 

Vk = j d^x Ak{x - Xk) lu, (x) e-2'^^^'=-(^-^'=) + Ck 

= J (fv Akiuk - v) 4 (v) e2"^-^'= + Ck (11) 
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where is the Fourier transform of the primary beam A^, and Iu{x) is the sky brightness field 
(expressed in units such as Jy/sr) with transform Iu{v). The instrumental noise on the complex 
visibility measurement is represented by e^. 

The uv-plane resolution of an interferometer in a single pointing is thus limited by the convo- 
lution with A. However, these sub-aperture spatial frequencies can be recovered by using the phase 
gradient in the exponential exp [27riv ■ x^] from a raster of mosaic pointings {x^}, provided that 
the spacing of the pointings is sufficiently small to avoid aliasing as discussed in Appendix A. 

To aid us later on, we introduce a convolution kernel 

Pk{v) = fkAkiuk-v)e^'''^-'''' (12) 

and thus 

Vk = J (fvPk{v)f{v) + ek (13) 

where fk = /T(^'fc) is the Planck conversion factor for the CMB given in equation (6). It is easiest 
to write these in operator notation, with 

V = Pf + e (14) 

where V and e are the visibility and noise vectors respectively, P is our kernel, and T is the 
transform of the temperature field. In this representation T can be thought of as a vector of cells 
in uv space. 



4. The Primary Beam 

In order to determine the response of the array to the radiation field, we need to know the 
primary beam A[x) of the antenna elements and its Fourier transform A{u). In general, for 
each frequency channel, each baseline has a primary beam which is the Fourier transform of the 
cross-correlation of the voltage illumination functions across the aperture of each antenna — see 
Thompson, Moran, &; Swenson (1986) for a detailed treatment of the interferometer response. For 
a real and symmetric primary beam that is identical between antennas, then the transforms are 
symmetric and real we can ignore the differences between cross-correlation and convolution and 
write 

A{u) = g * g 4^ A{x) = \f\ (15) 

for the voltage illumination function g{r, v) across the radius of the aperture r = \r\ at frequency v, 
where g is the Fourier transform of g. The CBI beams have been measured and are nearly identical 
and symmetric, and thus we will use a single mean primary beam and its transform for the array. 
For a heterogeneous array, the individual beams can be used with some added complication. 

For most antennas, such as those used in the CBI, the primary beam width scales linearly 
with observing wavelength, and thus g{r) is approximately constant with wavelength. Then, we 
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can define G{r) as the normalized aperture autocorrelation function, and write 

Akiu) = ^G{\u\Xk) (16) 
for a channel centered at wavelength A^, with 

(fuG{\u\Xk) = -j rdrG{r) (17) 

Jo 

normalizes the response to give unity gain on-sky at the beam center (^(0) = 1). If ^(r) = g{r) /g{0), 
then G{r) = g -k g. 

The two-dimensional primary beam response, A(x), is determined by means of measurements 
of a bright radio source over a two-dimensional grid of offset pointings centered on the source. The 
central lobe of A{x) for the CBI is well approximated by a circular Gaussian, which is characterized 
by its dispersion ax, which is related to the full width at half- maximum (FWHM) Qx by ax = 
ttx/s/S In 2. The Fourier transform of an infinite circular Gaussian is given by 

_ _ 1 _ l'"!^ 1 
A{x) = e ^ 4^ A{u) = - — 5- e 2^ = (18) 

where a^ is the Gaussian dispersion in Fourier space. The function G{r) is therefore 

G{r) = e ^ (19) 

for Gaussian radius Vg = Acr„. For the CBI the measured primary beam (sec Paper III) has ax = 
45^2 X (31GHz/iy), so cr„ = 28.50 at 31 GHz (A = 0.967 cm), which corresponds to Vg = 27.56 cm. 

For the CBI software pipeline, instead of using a Gaussian approximation to G{r) we have 
chosen to model the antenna illumination g{r) as a Gaussian truncated at both the dish edge and 
the secondary blockage radius 

1^1 — dinner 

6 ^ " 'inner < \r\ < D/2 (20) 
\r\>D/2 

where for the CBI antennas Tinner = 7.75 cm. Note that if g{r) and G{r) were infinite circular 
Gaussians, then s = Vg. A best-fit taper parameter s is obtained using the measured primary beam 
A, giving s = 30.753 cm or an edge taper of 0.118 (—18.6 db of power) at the dish edge. We then 
numerically tabulate the autocorrelation G{r) assuming s = 30.753 cm which is then interpolated 
in the code when A is required. This model is a better fit to the observed beam than a pure 
Gaussian beam (see Figure 1 in Paper III for a plot of this model). 

The resolution in uv or i space is set by the width of Ak (u) . For a Gaussian approximation to 
the beam, the dispersion in multipole £ is ai = 27r(T„ = 1/ax, and the FWHM is = 8 ln2/aa;. For 
ttx = 45f2 at 31 GHz we have FWHM ag = 422 {ag = 179). Given that features are expected in the 
power spectrum of widths significantly less than this, it is highly desirable to reduce the effective 
resolution width of the CBI by at least a factor three using mosaicing. 



9{r) 
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5. 



Gridded Estimators 



The principal problem in using likelihood (see §6) to determine confidence limits on the power 
spectrum for CBI data is the large number of visibilities compounded by the large number of mosaic 
pointings (typically 7 x 6 or larger). Even a modest reduction in the number of matrix elements 

passed to the likelihood calculation will greatly aid the computation. This suggests that we grid 
the visibilities before computing the likelihood function. For an effective resolution in the aperture 
plane determined by the primary beam and mosaic size, there is little use in sampling below this 
smearing scale, and we can define an optimum gridding scheme which minimizes the quantity of 
data and information loss (the gridding is a form of compression). 

We implement this by defining estimators A(u) for the true complex brightness transform 
which are linear combinations of the measured visibilities. These estimators bin together data 
from the different frequency bands and mosaic pointings. Thus, a direct sum of visibilities taken 
at the same u but over the whole mosaic x will result in an estimator that has a higher effective 
resolution in the uv plane. The result is that we can speed up the likelihood estimation at the cost 
of complicating the covariance matrix. In general, this matrix can be computed relatively quickly 
as it is a A^^ process, and thus this is a worthwhile trade-off versus the cost of calculating the 
likelihood. The estimators derived in Appendix A are not orthogonal combinations of the original 
visibilities, and thus some information loss is expected. However, the tests performed in §9.1 show 
that these estimators arc unbiased, and comparisons to results obtained using the visibilities directly 
show that there is no noticeable loss in sensitivity. Thus, our gridding can be considered to be an 
efficient form of (lossy) compression using the beam as a signal template. 

In Appendix A, we argue that a Aj formed by a linear combination of visibilities will give a 
estimate of the weighted average of / or T around uv locus Wj. We have from equation (A13) 



where the kernel Q is defined in equation (A13) and the kernel for the conjugate visibilities Q is 
defined in equation (A17). In particular, 



A = QV + QV* 



(21) 



Q,, = ^i*(-w,-w,)e-^---="'= (22) 



where Zi the normalization factor given in equation (A21), and w/j = e^^ is the visibility weight 
given in equation (A19). 



By operating with the gridding kernel on equation (14), we get 



A = RT + n 



R=QP+QP 



n = Q e + Q e* 



(23) 



where we define R as the convolution kernel that operates on the transform of the intensity (the 
gridded version of P), and n is the gridded noise. The conjugate to P defined in equation (12) is 
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given by 

Pk{v) = fk Ak{-Uk-v) e^-^--'^ , (24) 

which gathers the conjugate visibihties under the transformation Uk — —Uk- 

Although it is not necessary to do so, it is convenient to construct the Aj on a regular lattice 
in Ui with a spacing du- Thus the grid "cells" represented by the Aj represent an interpolation 
using Q of the visibilities onto the uv-plane. This will be useful when using the estimators to form 
filtered images (§8). 

If it is desired that the visibilities be used directly, for example when the datasets are small, 
then the ungriddcd case can be recovered by setting Qj^ = 6ik and Q^/^ = 0, giving A. = V and 
R = P, with no loss of generality in the derivations. 



6. The Likelihood Function 

To carry out the power spectrum estimation, wc form the likelihood of the data given covariancc 
matrices for the signal, noise, and foregrounds. Since the estimators are linear combinations of the 
visibilities, which we assume are made up of Gaussian noise and Gaussian signal components, we 
can use a multivariate Gaussian probability distribution to describe the estimators also. Because 
A is complex, it is easier to deal with the real and imaginary parts by packing them together in a 
double-length real vector 

y ImA J 

written here as a column- vector of length 2Nesi. 

The log-likelihood function for a real multivariate Gaussian probability distribution is 

In L{x\q) = -iVest In 27r - ^ ln(det C) - ^ C"^ d (26) 



(25) 



where d* is the transpose of d, and 



(Re A Re A*) (ReAImA*^ 



(ImA Re A*) (ImAImA*^ 



(27) 



is a block matrix of the real and imaginary covariances. The vector q represents the parameters of 
the model or theory against which the data is being measured (see below). These parameters are 
contained in C. 

In terms of A and A*, we can write 

ReA = ^(A + A*) ImA = ^ (A -A*) (28) 



-10- 



and therefore 



(Re A Re A*) 
(ImAImA*) 
(Re Aim A*) 
(ImAReA*) 



Re 
Re 
Im 
Im 



(A At) + (A A') 
(A At) - (A A*) 
(A At) - (A A*) 
(A At) + (A A*) 



(29) 



where At = (A*)* is the Hermitian transpose of A (a row-vector containing the complex conjugate 
of a column- vector) , and A At is the tensor or outer product of A and At , which is a matrix with 
elements A^. 

It is important to include the covariances of A A* as well as those for A At. Normally, only 
one of a given visibility Vk or its conjugate will correlate with V^/. However, for short baselines 
b < ^/2D (less than 127.3 cm for the 90 cm CBI dishes), there may be overlap between the support 
for a given visibility and both another visibility and its conjugate, as shown in Figure 1, and thus 
both may be nonzero. Note that the correlation between distant conjugate pairs is small, since the 
overlap occurs far out in the antenna response A, although it is non-negligible on the shortest CBI 
baselines where the overlap occurs at the 0.57D point (illustrated in Figure 1) for perpendicular 
1-meter baselines with the beam response ~ 30%. Outside the baseline range b > y/2D one of 
{V* Vk') or {Vk Vk') will be zero. 

The covariance matrix C can be split into a sum of independent contributions from instru- 
mental noise C'^, the CMB signal C^, and foreground signals C®^^ and C^®. We further split 
into a sum of terms Cg from separate £ bands of the power spectrum. 



C = C 



N 



B ^ ysrc ^ 



+ lies 



(30) 



The factors {qb, B = 1, . . . ,Nb} are the "bandpowers" (Bond, Jaffe, & Knox 1998) for bins with 

centers at i = Ib, and are the model parameters to be determined by maximizing the likelihood. 
The factor q^-cs is amplitude of the covariance due to a residual Gaussian foreground, and ^src is the 
amplitude of the covariance contributed by known point sources; there may more than one of each 
of these types of foreground covariance matrices. The gsrc and q^es can be treated as adjustable 
parameters to determined by maximum likelihood, or they can be held fixed at a priori values, in 
which case C^'''^ and are constraint matrices with their corresponding terms in equation (30) 
behaving like additional noise terms. 

In the following sections we consider each of the terms C'^, C^, C®"^^, and C"^^^ in turn. If we 
write 

M = (AAt) M = (AA*) (31) 

then in each case we calculate the contributions to the covariance matrix for the real and imaginary 
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parts of the estimators using equations (27) and (29) 

/ iRc[M + M] -ilm[M-M]\ 

1^ ilm[M + M] iRe[M-M] J ^ ' 

with the individual covariance matrices given by insertion of the appropriate contribution to M 
and M for that component, e.g. and to compute the block elements of C^. 



6.1. The Noise Covariance Matrix 

The instrumental noise correlations are assumed to be Gaussian and independent between 
different baselines, and frequency channels. For the CBI, tests have been carried out on the data 

which show this to be true to a high level of accuracy. In this case, the noise contribution to the 
real and imaginary parts of the visibilities are independent zero-mean normal deviates with 

(ReefeRecfe') = (Imefelmefc/) = e\5kk' (Reefclmefc/) = (33) 

and thus we can write 

(eet)=E (ee*)=0 (34) 

for real noise matrix E, where E'^fe' = 2e^ <5fejfc'- 

It can be shown that the noise contributions n to the estimators defined in equation (23) have 
the contributions to the covariance elements M and M defined in equation (31) given by 

= (nnt) = QEQt + QEQ^ 

ivr + («^^) 

M = (nn*) = QEQ +QEQ* 

using the covariances of e given in equation (34). This is assembled into the covariance matrix 
using equation (32) . In general, the gridding kernel Q will map a given visibility to more than one 
estimator, and thus C'^ will have non-zero off-diagonal elements. Furthermore, if there are noise 
correlations between baselines or channels, then the structure of C'^ will be even more complicated. 



6.2. The CMB Signal Covariance Matrix 

The CMB contribution to the visibility covariance matrix is given by the covariance of the 
RT term in equation (23) 



MS = R(tT^)Rt M^ = R(TT*)R* (36) 



wher^TT') and (TT ) are given in equations (3) and (4)respectively. Then, the elements of M^ 
and M for estimators i and j are 

Mfj = J (fvC{\v\)Ri{v)R*{v) = 2Tr J dwwC{w)Wij{w) 
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with 

r2-K 



1 

w^uM = :r / deRi{w,e)R*{m,e) 

JO 

= — / deRi{w,e)Rj{w,e-Tr) (38) 
2vr Jo 

where to aid in breaking up tlie CMB covariance matrices into bands we write the integrations in 
terms of polar Fourier coordinates {u,v) (cc, ^) (ro = \v\). 

As an illustration, consider the case without gridding. Then, R = P, and using equation (12) 
in (37) we get 

Mik' = fk fk' j d^v C{\v\) Akiuk - v) Al,{uk' - t,)e2--(-/=— fc') 

= fkfk' j 'fvC{\v\)Ak{uk-v)Ak\uk' + v)e^^'^-^''''-''y^ (39) 

for the covariance matrix element between visibilities and Vj./ . 

We furthermore write the radial integral over = €/27r as a sum with respect to Q of equation 

(5) 

where Wnj is the variance window function (e.g. Knox 1999). 

We define the bandpowers {qb, B = 1,... , Nb} by constructing Q piecewise with respect to a 
fiducial shape cf^^^^ 

Ce = ^qBCf'^''XBi (41) 
B 

where 

breaks the power spectrum into non-overlapping bands. The standard choice for the shape is 
^shape _ ^ equal power per log-£ interval, with qs then giving the bandpowers in units of T^. 
Then, to calculate C^, we construct band versions of the covariance matrix elements in equation 
(36) _ 

Ml = E ^ ^t""" XBi m| = ^ ^ Cf XBe (43) 
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where = YIb and M = YIb Mg. These are then combined following the prescrip- 

tion in equation (32) to assemble the Cg. 

The variance window function Wij{v) is the convolution of the Ai{v) and Aj{v), and thus 
its width is characteristic of the sqiiarc of the Fourier transforms primary beam, or FWHM A£ ~ 
ail\f2. Thus we would expect in a single field to be able to achieve a limiting resolution of A£ ~ 300 
for = 422 at 31 GHz. This will be increased by the mosaicing by a factor roughly equal to the 
extent of the half-power width of the mosaic relative to that of a single field. In practice, the 
limiting useful width for the i bins for the bandpowers will be set by the band-band correlations 
introduced in the maximum likelihood estimation procedure (see §7 and §9.1 for further discussion 
and examples). 



6.3. Known Point-Source Constraint Matrices 

Consider a set of iVc point sources at positions Xc with flux densities Sciy) (c = 1, . . . ,7Vc). 
The intensity field at frequency v is then given by 

l,{x) = Y,S,{v)b''{x-x^) (44) 

c 

which is assumed to be uncorrelated with other intensity components like the CMB. The effect V^'^ 
on the visibilities (e.g. eq.[ll])is then given by the sum over sources 

= E = Akixc - Xk) e-2--/=-(-c-=.fc) (45) 

c 

where Vck is the contribution to visibility k of source c. We assume that the positions of the sources 
can be determined with negligible uncertainty through radio surveys, and that the errors are due 
to uncertainties in the measurements of the flux densities. Then, the covariance between the source 
contributions to visibilities k and k' is 

{vrvkn = J2J2(^o{^k)sA^k'))Ak{xc-xk)Ai,{xc'-xk') 

c d 

X g-27rmfc-(a;c-a:fe) g27rmj,/-(a;^/-a;j./) (^g) 

where {Sciyk) 'S'c'(^fc')) the flux density covariance matrix between sources c and c' at frequencies 
ffc and vy respectively. There is a similar covariance matrix ^^'^V^'^)- T These can be passed 
through the gridding procedure using equation (21) to make 

^src ^ Q y-src _^ q y src* (47) 

and used to construct the covariance elements 



(48) 
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using equation (31). 

This covariance matrix can be greatly simplified if we can subtract off the mean source flux 
densities, leaving a zero-mean residual error. Let the true source flux density Sdi') be the sum of 
the measured flux density S°^^{i') and an error SSdiy). If our measurements of these foreground 
sources are accurate, then the residuals 6Sc{i^) should be uncorrelated between sources (they are 
due to measurement errors) and have zero mean. In this case, we can make corrected visibilities 

ycor 

Vr = Vk-J2 "^ck" = E ^c'^'i'^k) Akix, - Xk) e-2--fc-(— (49) 

c c 

to be used in place of V in subsequent analysis. Then, we are left with the fluctuating component 

c 

6V,k = dSMAk{xc-Xk)e-^^'^'^<^^-^'^^ (50) 

which we must deal with statistically. The covariance between the source error contributions to 
the visibilities, assuming the flux density errors are independent between sources (but not between 
frequency channels for the same source), is given by 

{6VrSV^F*) = ^{SSMSS.M) Akix, - x^) Al,ixc - x^') 

c 

and similarly for {SV^"^^ 5V^J^) . Finally, if the covariance is separable, e.g. 

{5S,{u)5Sc{u')) = asciy) (rsc{y') (52) 

then we can write 

/XT/src rT/src*\ \ ^ _.src _src* 

c 

= asMAk{x,-Xk)e-''-'^^<^^-^>'\ (53) 

The other covariance {^V^'^'^ dV^f^) can be computed in the same way. Because we have assumed 
that the covariance is separable, we can speed up the covariance calculation as only the vector a^^"^ 
for each source is needed. We can grid this onto the estimators 



A^'^^ = Q<^ + Qo-r* (54) 

and then 

]y[src ^ ^ ^src ^src f ^''^ ^ ^ ^src ^src t (55) 

c c 

which are used to build C^'^'^. 
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There are two components to the source flux density uncertainties crsc{^k)^ from the uncer- 
tainties on the source frequency spectrum, and the other from the uncertainties on the flux density 
measurements and any extrapolation of the measured flux densities to the observing frequencies 
Vk (using the estimated source spectrum). As an example, consider a source with a flux density 
Sc{vq) measured with standard deviation aso at frequency fo, and a known power-law frequency 
spectrum with spectral index a, 

Sc{i^k) = Sc{yQ) fivk/^Q.Oi) f{i^k/^Q,<^)=[^^ ■ (56) 

Then, it is easy to show that 

(^Sc{j^k)/Sc{i^k) = (^so/Sc{vq) (57) 
with the fractional uncertainty in the flux density (Tsc/Sc remaining independent of the frequency. 

On the other hand, consider the case where there is now an uncertainty (Tq, in the spectral 
index between i/^ and vq. Then, our extrapolation factor /(i^/z/q, a), which we write as 

/(z./z.o,a) = e"^^('^/'^°), (58) 

propagates to the extrapolated flux density as 

(^Scivk)/ Sc{vk) = ln(z^/ vq) Oa (59) 

which can be negative — for two channels flanking the fiducial frequency (e.g. u < < u') 
the errors will be anti-correlated. Note that we have approximated the resulting distribution as 
Gaussian. In general it is not, e.g. for a Gaussian distribution in a we find a log-normal distribution 
in S{v). 

Although the dominant spectral error is due to the extrapolation from a frequency outside 
the range of the CMB instrument, there is an additional error due to an error in the spectral 
index over the frequency channels i/^ of the visibilities. This is as if you extrapolated using one 
spectrum appropriate for the band center u of the instrument, but when the flux densities S°^^{i'k) 
are extrapolated from band center S^^^ {i>) there is an error from using the wrong a over the band. 
This is handled using equation (59) with another ctq appropriate to the uncertainty in the spectral 
index over the Vk- 

For the CBI analysis, we have approximated both the flux density error and the spectral 
extrapolation error as a single equivalent flux density error. For tlic CBI, the frequency span (26- 
36 GHz, or dv/v = ±16%) is small enough that we can approximate the spectral index uncertainty 
as an effective flux density uncertainty ac extrapolated to band center v from uq using 

crsc{i^k) ~ f{^/^,a)uc al = f{i'/i'o,ao) (asQ + S^ [ln{i? / i/o)f a^j (60) 

where a need not equal oq and should reflect the spectral index over the observing band, not the 
one used for extrapolation from uq. 
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In principle, if the true mean flux densities for the sources are correctly subtracted from the 
visibilities, and the covariance matrix C**'''^ is built using the correct elements {5Sc{i^k)^Sc'{'yk')) i 
then inclusion of this as a noise term in C using ^src = 1 would remove the effects of these sources 
from our power spectrum estimation in a statistical sense. However, there are a number of factors 
that make this difficult. If the source flux density measurements have a calibration error, then 
the errors will not be independent between sources. Also, the fainter sources (which are still 
significant contributors to the signal) have Atix densities that arc often extrapolated from much 
lower frequencies (e.g. the "NVSS" sources in Paper II and Paper III). Furthermore, since there are 
a relatively small number of discrete sources contributing to a given field, it is not clear we are in the 
statistical limit where the flux density covariance is an accurate description of what is happening to 
the data. For these reasons, for the CBI analysis we treat the covariance matrix C^"^^ constructed 
using the approximation in equation (60) as a constraint matrix for the nuisance parameters due to 
the sources, and set Qstc to a high enough amplitude to project out the contaminated modes in the 
data. Because the source modes are spread out by the effect of the synthesized beam (the "point- 
spread function" in imaging terms), setting ^src to too high a value will start to down- weight modes 
that are not significantly contaminated, while too low a value will eat into the noise and CMB 
signal power in those modes without down-weighting them sufficiently thus biasing the affected 
bandpowers low. The exact values to be used thus depend upon the signal and noise levels in the 
data; we refer the reader to Papers II and III for descriptions of what was chosen for the CBI 
analysis. See Bond, Jaffe, & Knox (1998) for a description of the constraint matrix formalism and 
the technique of projection. 



6.4. Gaussian Foregrounds and Residual Point Sources 

In §2 it was mentioned that a single foreground component could be modeled with a modified 
covariance matrix, power spectrum shape and frequency dependence. As long as these foregrounds 
can be treated as a Gaussian random field, they can be processed in the same manner as the CMB. 
Therefore, once the amplitude and shape of the foreground fluctuation power spectrum Cj:es{v) is 
input, we compute the foreground covariance matrix elements 

^res ^ _^ qes ^^^^ ^ (61) 

e e 

where the variance window functions W'^'^ and w'^'^'^ are given by substituting for R in equation 
(38) a new IV''''^ built from a kernel 

priv) = /r M^k - v) 62---"= (62) 

using a frequency factor f^.^^ = ^'^^{I'k) appropriate to the foreground in question. The matrix 
is then obtained by substitution of M"^^^ and M^^^^ as usual using equation (32). Although it 
is possible to break up the Gaussian foreground component into bands as we did the CMB, it is 
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preferable to compute the foreground covariance matrix in a single band using its shape C^^, to 
reduce the degeneracy with the CMB — you cannot distinguish between the two in narrow £ bands 
where the shapes are unimportant. 

An example of a foreground that strongly affects the CBI data is that from point sources below 
the limit for subtraction contaminating the CBI fields. This residual statistical background, in the 
limit where there arc many sources per field, can be modeled as a white noise Gaussian field with 
constant angular power spectrum and power-law frequency spectrum. Each individual source has a 
flux density drawn from a differential number count distribution dN{Si,)/dS which represents the 
number of sources per steradian with flux densities between and Sv + dS at observing frequency 
u. The angular clustering in these sources is very small, and can be neglected. 

The contribution of a source c to visibility was given by Vcu in equation (45). The sources 
are independently distributed in flux density and on the sky, so 

{Vk Vv) = (E ^'^^''k) ScM Ak{x, - Xk) Al,{xc - Xk>) e2^-.-(-c-x,)g-2.i.,,.(xe-=.,,)^ 

c 

= ^ 'S'c(fjt) Sc{yk')) Bkk' 

c 

= a''\vk,Vk')Bkk' (63) 
where the angular average can be written as an integral over 

Bkk' = j cFx Ak{x - Xk) Al,{x - Xk') e2-^"'=-('^-^'=) ^.-^^i^yi^-^y) (54) 
with J7 as the normalizing solid angle. This integral is just a Fourier transform, and so 

Bkk' = j d^v Ak{uk - v) Al, {uk' - v) e^^'^<^u-^k') (65) 

which is the same as the CMB visibility covariance matrix Mkk' in equation (39) with fk = fk' = ^ 
and C{v) = 1. Similarly, the other covariance {VkVk') reduces to Mkk'- Thus, in the stochastic 
limit the residual source background behaves as a Gaussian random field and can thus be treated 
as we do the CMB signal in §6.2 but with a power spectrum shape Q = C^^^{uk,i'k') which is 
constant over i for a given pair of frequency channels. 

The amplitude of the covariance matrix is the ensemble average of the source power per solid 
angle, which is obtained by integration over the flux density and spectral index distributions 

C {i^k,i^k')= dSS / dap{a\S,i^o) I — —j (66) 

where we have again assumed that the spectrum is a power-law with spectral index a. over the 
range of interest for the f^, and integrate over the number counts over the flux density range from 
'S'min to -Smax- We also assume that there is a large number of these faint sources over the solid 
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angles of interest (e.g. the CBI primary beam) and thus the Poisson contribution to the probabihty 
can be ignored and we can use the mean source density given by the number counts dN/dS at the 
fiducial frequency vq for which S is given. The spectral index distribution as a function of flux 
density p{oi\S,uq) must be estimated from radio surveys, though it can be uncertain at the high 
frequencies and faint levels at which the CMB experiments are carried out. If p(a|S', vq) = p{a\vQ) 
and thus is independent of flux density, then it can be shown (e.g. Appendix B) that the integrals 
in equation (66) can be evaluated at a single frequency v in the band and scaled using an effective 
spectral index a^s 

C^^{^k.^k') = Crffff ff = (^)"^' (67) 

where C^^^ is the amplitude of the fluctuation power per solid angle (in units of Jy^/sr) at i'. In 
terms of the logarithmic average C for the CMB, C^^^ = /2'k which rises at high i with respect 

to the CMB. See Appendix B for an example analytic calculation using power-law source counts 
and a Gaussian spectral index distribution. 

The frequency range of the CBI is insufficient to distinguish nonthermal foreground emission 
from the thermal CMB, and thus this is treated as a constraint matrix (i.e. ^res is not solved for 
as a parameter). Therefore, in the CBI analysis we construct the covariance matrix C^^^ using 
the matrix elements in equation (61) built assuming unit power (1 Jy^/sr) and the frequency 
dependence fk = from equation (67). The value used for g^cs is equal to the source fluctuation 
power C'^^^^ calculated as an a priori estimate based on knowledge of the residual foreground source 
populations (see Appendix B). 

6.5. Other Signal Components 

We are not restricted to CMB, Gaussian foreground, and discrete point sources as the com- 
ponents of our signal or noise in the covariance matrix C in equation (30). This approach can be 
generalized to deal with other signals of interest. For example, extended sources with a known 
profile, such as the Sunyaev-Zeldovich Effect from clusters of galaxies, could be modeled either 
analytically or numerically giving a power spectrum shape (e.g. Bond & Myers 1996). In the case 
of a signal with a known distribution on the sky, a template can be used. Examples of this include 
dust emission in the millimeter- wave bands, or the anomalous centimeter- wave emission observed at 
OVRO (Leitch et al. 1997) and by COBE (Kogut et al. 1996). In particular, the latter foreground, 
which is posited as due to spinning dust grains by Draine & Lazarian (1999), correlates with the 
100/xm dust emission as measured by IRAS and DIRBE, and thus a template of emission can be 
constructed. 
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6.6. Differencing 

Unfortunately, with its low intrinsic fringe rates and extremely short (< 90 A) spacings, the CBI 
is susceptible to ground pickup. To remove this, we observe for each field a trailing field displaced 
gm right ascension 8™ later, and difference the corresponding visibilities. Therefore, we must 
take this differencing into account in our correlation analysis. This effectively modifies the window 
function, quenching low spatial frequencies further. 

Let us write 

for switching offset Axj- (e.g 
from equation (11) we find 

= A y Muk - v) f{v) e^™-^^ [1 - e2'^^^-^^'=] + (69) 

where the switched noise e^,^ = e™^^"^ — e^''^'^ In terms of the kernel of equation (12), 

= j cfv P^iv) f{v) + er Priv) = Pk{v) [1 - 6^--^-'=] (70) 

and thus for our switched visibilities wc compute everything as before, but substituting P^" for Pj-. 

Note that if the trail field offsets Ax were constant in arc length rather than in Right Ascension 
(this is approximately true since the declination range of the mosaic is limited) we could write the 
convolution kernel as 

P^iv) PI7*iv) = Pkiv) P;,{v) [2 - 2 cos(27rt; • Ao;)] (71) 

where the leading factor of 2 dominates (you essentially get twice the CMB power). Note that a 
noticeable effect of the differencing is that the window function will have a ripple of "wavelength" 
Ax~^ superimposed on its envelope. For example, the 8™ switching in RA that the CBI uses 
corresponds to Ax = 2° at the celestial equator, and thus the ripple has Ax~^ = 28.6 in u. This 
corresponds to 180 in £, but is azimuthally averaged in the u?>-plane and thus the peak-to-peak 
amplitude is reduced. 



= Vr''-Vt'' x'^-'' = Xk + Axk (68) 

. 8°^ in RA 2° for the CBI fields near the celestial equator). Then, 



7. Solving the Likelihood Equation 

We have expressed the estimators as a real vector d and obtained expressions for the compo- 
nents of its covariance matrix, and now turn to the problem of solving for the maximum likelihood 
estimators for the bandpowers using equation (26). As shown below, we will be carrying out a large 
number of matrix operations using C and its component matrices (e.g. C^, C^, etc.) and thus 
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these will need factorization. Because C is positive definite we use optimized Cholesky decompo- 
sition routines (DCHDC from LINPACK, or DPOTRF from LAPACK)^ to carry out the required 
factorizations. 

The large number of visibilities times the number of mosaic pointings makes this computa- 
tion extremely costly (the matrix inversions and/or solution of systems of equations are order N'^ 
processes!), especially for a large number of bands Nb- Clever perturbative or gradient search 
methods can help to reduce the overhead in finding the maximum in parameter space. One such 
method is the quadratic relaxation technique of Bond, Jaffe, & Knox (1998, BJK). To summa- 
rize here, if one Taylor expands the log-likelihood around the maximum likelihood bandpowers 
q = {qB,B = 1, . . . , Nb} to second order 



1 / - t- X , r / ~ X d In L(q) ^ 1 v-^ v-^ \nL(q) ^ 

In L(g + 6q) = In L(,) + '''^2^^ 

then we can move toward the maximum using the quadratic approximation 



(72) 



E 



'd'^ InL(g) 
dqB dqB' 



d InL(q) 
dqB' 



(73) 



The first derivative (gradient) is given by 
d In L{q] 



dqB' 2 

and the second derivative (curvature matrix) by 

^2 InL(q) 



-Tvf(dd*-C)(C-i^C-i) 
dqB' 



^BB' = —■ 



dqB dqB' 



= Tr 



(dd^ - c)(c-^ l^c-^ 1^ c-i - Ic-i _ 

oqB oqs' 2 oqBoqB 



-I- -Tr 
2 



dq 



B 



dqB', 



(74) 



(75) 



Note that the partial derivatives of the covariance matrix are just the band signal covariance 
matrices dC/dqs = defined above. 

The final approximation is to replace the curvature matrix with its expectation value, which 
is the Fisher information matrix 



Fbb' = {J'bb') = ^T^ C| C|,] 



(76) 



^available from http : / / www . netlib . org/ 
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This yields 



)] 



(77) 



B' 



for the iterative correction to the bandpowers. This amounts to making a quadratic approximation 
to the shape of the hkeUhood function around the maximum, and iteratively approaching it. At 
each step, the total covariance matrix C must be updated using the new bandpowers q + 5q. A 
convergence criterion based on the magnitude of the corrections Sqb will allow approach to the true 
{qb} to be controlled. 

The inverse of the Fisher matrix [F~^\bb' evaluated at maximum likelihood is the covariance 
matrix of the parameters (Bond, Jaffe, & Knox 1998). The diagonals [F~^]bb give an estimated 
Gaussian error bar for the derived bandpowers {qb}-, though the full Fisher matrix must be used 
to take the (usually significant) band-band correlations into account. As the width of the I bins 
for the bands B is reduced, anti-correlation between adjacent bands increases due to the intrinsic 
^-space resolution of the data. 

The presence of known or residual point source foregrounds in equation (30) is dealt with 
by either fixing the amplitudes Qstc oi" 9res ^nd treating Qsvc C^'"'' or q^es 

C^^^ as additions to the 

noise matrix C'^, or by solving for the ^src or gres and treating them as extra bandpowers qb with 
associated entries in the Fisher matrix. In practice, for the CBI, it is necessary to hold fixed 
the Qrcs because the contribution from the source foreground with a white noise power spectrum 
and appropriate frequency spectrum is largely indistinguishable from an overall offset of the CMB 
power spectrum. In addition, the uncertainties on the individual known source contributions to an 
aggregate C^'^'^ will be substantial, and thus solving for a single amplitude q'src will not be as useful 
as it might appear. In this case, the C^''^ acts as a constraint matrix and the ^grc can be set to 
an arbitrarily high value, which will effectively project out the modes corresponding to the known 
sources by down-weighting the relevant combinations of the estimators in the likelihood (Bond, 
Jaffe, k Knox 1998; Bond & Crittenden 2001). 



Consider observations taken of separate sets of single fields or mosaics / where there is effec- 
tively no correlation between fields from separate / and the fields within a given set / are related 
by the mosaic covariance given in the previous sections. In this case, we can assemble a giant data 
vector 



from the M individual field vectors d/ (e.g. eq.[25]), with the block diagonal covariance matrix 



7.1. Combination of Independent Datasets 
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/ c 



1 



C = 



(79) 



v 
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which in turn can be written as sums of block-diagonal noise and signal covariance matrices 
and C^j etc., with blocks given by Cj and C^^, etc. Because they are block diagonal, we can 
write the log-likelihood in equation (26) as the sum over datasets 

In L = - ln(27r) ^ iV/ - - ln(det C) - ^ C"^ D 



= - ln(27r) J2^f-ljl l^(det C/) - i ^ d} Cf df. 



(80) 



/ 



/ 



We proceed as before, with the same bandpowers {qs} and with the block-diagonal band 
covariance matrix 



dC 



V 
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BM / 



and thus all matrices are block-diagonal and composed of the individual single field or mosaic 
matrices. Therefore, 

Fbb' = 2 X] [^/ ^ ^Bf 

= lj2[^~']BB'J2'^[idfd^f-Cf){Cj'C%.fCj')\ (82) 

B' f 

which is used to iteratively approach the {qb} using the BJK scheme as in the single dataset case. 



7.2. The Bandpower Window Function 

To compare the bandpowers obtained from the data to model power spectra we need to define 
a set of filter functions which project models Q into a set of bandpowers Cb 



(83) 



as in Bond, Jaffe, Sz Knox (1998). In the ensemble limit, the expectation value {{xx^ — C^)) 
will approach the underlying signal covariance matrix C^. We can then use the expression for the 
minimum variance estimate of the bandpower to derive the filter functions WP Knox (1999). Since 



M = JE[^"']^b'^[(c-^c|,c-^)cS] 



(84) 



and 



B 



(85) 
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the normalized filter functions can be computed using the band averaged Fisher matrix (e.g. eq.[76]) 



BB' 



TV 



(c-^c|,c 



(86) 



with respect to the cf^'^^'^ = i that is built into the C^. Because of the XBi used in the construction 
of the in equation (43), 

Y.XB'eWf/£ = SBB' (87) 

and thus Wf /i is orthonormal with respect to the bands defined by XBi- 

Calculating the filter functions at each i becomes somewhat prohibitive in both processor time 
(the problem scales as an extra + 21^3^^^ operations) and storage since the calculation of 
equation (86) can only proceed once we have relaxed to the maximum likelihood solution. For this 
reason, in practice we sample the full filter functions in bands at intervals Bf where the Bf are 
narrower than the bands B, with 



BB' 



Tr 



(c-ic|, c-i)c|^ 



(88) 



In principle, this is equivalent to assuming a flat window over the 'fine' band Bf and as long as 
the curvatTirc of the exact window function is small over the intervals Bf this should provide an 
adequate sampling of the continuous limit. 

To obtain model bandpowers we can then either interpolate the samples W^j, to obtain an 
approximate form for Wf for use in equation (83) or pre- average the model spectrum over the fine 
bands Bf as 



CB = Y.(wE,/iB,)cf^ 



(flat) 



(89) 



where C^^^^^ are bandpowers calculated using fiat filters iy^f^ = 1). We find that a fine band 
width ^(-Bf ~ 20 is sufficient to adequately sample the window functions and ensure normality and 
orthogonality to within 0.5% with respect to integration over the bands (e.g. eq.[87]). Example 
window functions calculated in this manner for mock deep fields and mosaics are shown in the lower 
panels of Figure 2 and Figure 3 respectively. 



7.3. Component Bandpowers 

A further complication at the parameter end of the process is that the likelihood of the band- 
powers cannot be assumed to be a Gaussian. This is especially so in cases where the error in 
the bandpowers is sample or cosmic variance limited. Assuming the bandpowers to be Gaussian 
distributed can lead to the well known problem of cosmic bias where the likelihood of low power 
models can be overestimated and conversely that of high power models can be underestimated. 
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Bond, Jaffe, & Knox (2000) have shown how one can avoid this problem while still retaining Gaus- 

sianity in the analysis by treating certain functions of the bandpowcrs as Gaussian distributed. 
Very good fits to the non-Gaussian distribution of the bandpowcrs can be obtained by use of the 
offset log-normal and equal variance approximations to the likelihood. 

Both approximations use offsets in the bandpowcrs which describe the contributions to the 
error in the bandpowcrs due to components other than the CMB. For the range of scales probed 
by instruments such as CBI these components will include the foregrounds such as point sources in 
addition to the usual noise 'on the sky' offset x§ ~ Yle XBi xe, where is the offset due to the noise 
contribution to the error such that the quantity Z£ = ln(Q + x^) has a normal distribution Bond, 
Jaffe, & Knox (2000). For accmatc parameter fits we therefore require estimates of bandpowcrs for 
all the components making up the total covariance C. An approximation for these can be obtained 
by modifying the minimum variance estimator for the bandpowcrs qb at the maximum likelihood 

9b = IY1 ^^bb']-' TV [(C-i C|, C-i) C^] (90) 

B' 

where we have substituted in equation (77) for the observed measure for the signal covariance 
(dd* - CN). Wc then set to the noise C , foreground source Qstc C^^'^ or Gaussian residual 
foreground grcs C'^*^ covariance components as desired (or use the maximum likelihood values qsrc 
and ^res if these are included as parameters in the solution rather than being fixed). Examples of 
these are shown in Paper II and Paper III for the deep field data and mosaic data respectively. 
The offset to the log-normal is then obtained by summing the over the components, xb 
5b + lW + (^-S- Bond & Crittenden 2001). This formalism is used in Paper V to approximate 
the shapes of the likelihood functions in order to derive limits on the cosmological parameters. 



8. Imaging from the Gridded Estimators 

Although not the primary goal of this method, an image can be constructed by Fourier trans- 
forming back to the sky plane using equation (1). If the estimators Aj are constructed on a regular 
lattice in Uj with spacing du and uv extent Ldu-, then the resulting image will have an extent on 
the sky given by the inverse of the spacing and a resolution given hy d~^/L. In the continuum 
limit (see Appendix A), we can define an estimator T{x) for the temperature field T{x) 

f{x) = J d'^uA{u)e^^'''-'^ (91) 

where A(ti) is the continuous functional form (e.g. eq[A2]) for the estimators, with Aj = A{ui). In 
practice the lattice of estimators Aj is embedded in a wider grid padded with zero in the unsampled 
cells, and an FFT is carried out. 

For our standard gridding normalization Zi = z^^^ given in equation (A21), the units of A 
will be flux density units (Jy) and thus its inverse Fourier transform will produce a map in units 
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of Jy/beam, where the beam area is given by the point-spread-function (PSF) of the image. For 
a single field, the PSF is just the image generated using equation (91) using estimators A?^^ 
computed by introducing unit "visibilities" into equation (21) 

AP^ = J2{Qik + Qik}- (92) 

k 

The situation for the mosaics is somewhat more complicated, as the mosaic offsets must be taken 
into account in constructing equivalent visibilities for point sources 

k 

V^^^ix) = Afe(a;-a;fe)e-2™'=-(") (93) 

obtained by evaluating equation (11) with no noise and I{x) = 6'^{x — x). In this case one would 
evaluate the PSF at various positions x in the map. 

Because our estimators use the kernel Q as given in equation (22) which includes the beam 
transform A, we are effectively multiplying the image on the sky by the primary beam squared — 
once in the kernel, and once due to the instrument itself (e.g. eq.[12]). Images made directly from 
the A will therefore be strongly attenuated in the (noisy) outskirts. 

As mentioned in Appendix A, the optimal weighting for the imaging of the CMB component 
is to use the Planck factor in equation (6) to correct for the thermal frequency spectrum (e.g. 
eq.[A20]), while our standard intensity weighting given in equation (A19) is optimized for a flat 
non-thermal power-law spectrum with spectral index a = 0. 

We can also filter the gridded estimators in such a way as to enhance or down-weight certain 
signals or noise. We can do this with optimal or Wiener filtering (e.g. Bond & Crittenden 2001), 

= *A (94) 

where the choice of the filter * depends on the application. For example, the covariances C"^ calcu- 
lated from equation (32) can be used to construct optimal filters for each component contributing 
to the observations. For the signal component described by the covariances C-^ we can construct a 
Wiener filter to be applied to the gridded uv estimators 

A^ = C-^ A. (95) 

The amplitudes for the signal models such as the bandpowers qb or the source amplitudes qgrc can 
be set to their Maximum Likelihood values or to fiducial model amplitudes. The Wiener filtered 
image is then recovered by Fourier transforming. 

Examples of images created using this method are described in the next section, and shown in 
Figure 6. Wiener-filtered images are also used in Paper VI to explore the possibility of detection 
of the Sunyaev-Zeldovich effect in the data at high £. 
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9. Implementing the Method 

The algorithm described above was coded as a scalar Fortran {ill compatible) program 
designated cbigridr, with a parallelized Fortran 90 version using OpenMP^ directives also 
available for use on multiprocessor machines. In addition, the BJK likelihood relaxation was coded 
in a second parallelized Fortran 90 program called mlikely using parallel versions of LAPACK 
matrix algebra routines (e.g. §7). Together, these two programs make up the CBI analysis pipeline. 
This pipeline has undergone numerous tests and development since its inception in April 2001, 
and has been used to produce the power spectra and to provide the bandpowers as input to the 
cosmological parameter analysis given in the companion papers. We now give a brief description 
of our implementation. 

In order to carry out the numerical integrations, a fine-grain rectangular lattice in u?>space 
was used. The fine-grain grid size Aufine (in units of wavelength) was chosen to adequately sample 
the phase turns in the uv plane due to the mosaic size and differencing; for a standard 7x6 CBI 
mosaic with 20' spacing, the maximum field separation along the grid direction is Xmax = 2° , which 
gives oscillations in the uv plane with wavelength of x^^g_^ = 28.65, giving Aufinc < 14.3 for two 
samples per cycle. To evaluate the projection operators Ri{v) (e.g. eq.[23]), we store a small fine- 
grain lattice around each Wj. The maximum radius in uv space needed for the support of R is 
ru = 2D/Xram- For CBI D = 90 cm, and Amin = 0.844 cm at 35.5 GHz, we get = 213.1. A grid 
of size 53 x 53 cells with Augne = 8.526 will fit both the sampling and radius requirements. 

The estimators are evaluated on a coarse-grain lattice of Ui, with a spacing of Aucoarse- The 
fine-grain lattices on which we accumulate the Ri{v) will be cross-correlated to form the covariance 

elements and (e.g. eq. [37]), it is desirable to have the coarse grid size locked to integer 
multiples of the fine-grid cells. This coarse grid docs not have to sample the highest mosaic fre- 
quencies, but only the effective width of R. We find that for single CBI fields, Aucoarse = 3 Augne 
is adequate. For CBI mosaics, a hybrid lattice with Aucoarse 

= Aufine ill the inner part (£ < 800) 
and Aucoarse = 2 Aufine the outer part was found to work well. 

As stated in §2, the choice of sign of the exponential of the Fourier transform in equation (1) is 
a convention. This choice varies throughout the literature on the subject, but in practice depends 
upon the way the baseline vectors are defined in the data and how the correlation products are 
made (e.g. which antenna gets the quadrature phase shift). We note that in coding our algorithm 
to conform to the imaging standards of the AIPS^, and DIFMAP^ (Shepherd 1997) packages using 
the CBI data, we had to use the opposite sign convention from the one presented in equation (1). 

To process a dataset, a spectral weighting /{v) and shape function cf^'^^'^ are chosen. The 



*http : //www . opeiunp . org/ 

^http: //www . cv.nrao . edu/ aips/ 

®f tp : //ftp . astro . caltech. edu/pub/difmap/ 
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visibilities are looped over, and any source subtraction (§6.3) is applied. For each estimator i 
that Vk contributes to either directly or as a conjugate, its contribution to the fine-grain lattice q 
for Ri{vq) is accumulated, e.g. 

^{""q) = ^{Qik Pk{Vq) + Qik Pk{Vq)} Vg = Ui + A^Xfine Vg (96) 

k 

where Vq is a 53 x 53 unit (fine-grain) lattice. This means that for ricst estimators, the storage 
required for R is only 2809 x riest double-precision complex numbers. If the data were differenced (as 
for CBI data), then P|™(i)) from equation (70) is used. The contributions to the noise covariance 
elements M'^ and M (eq. [35] ) , and the A^'^'^ (cq. [54] ) are also accumulated at this time. Finally, 
this visibility's contributions are added to estimator Aj and normalization Zi. 

The storage for R, M'^, and m'^ dominate the memory requirements. For example, the CBI 
mosaics use around Uest = 2500 estimators, and thus storage for 53 x 53 x 2500 7 x 10^ double- 
precision complex numbers is needed. A single packed n^^^ ~ 6 x 10^ array is needed to hold 
and M . The C matrices are calculated in place and written out row by row, and thus need not 
be stored. There are no instances where matrices of dimension N^-^^ are stored; the storage for a 
matrix of this size would be prohibitive as our largest CBI mosaics have A^vis > 2 x 10^. 

When all the visibilities have been processed, the estimators are normalized by Zj, split into 
real and imaginary parts, and written out to disk. The covariance matrices Cfj, C^-, and any 
C|J^ are constructed by looping over pairs of rows corresponding to the real and imaginary parts 
of each estimator, e.g. rows i and i + Nest for estimator i. For each j < i, the stored Ri and Rj are 
cross-correlated along with the shape function C(|t?|) to form the bandpowcr covariance elements 
Msij and Msij of equation (37), combined to make C^.^ using equation (32), and stored. The 
relevant columns of these rows of are formed from the stored and M^-. At this point, for 
each C^'''^ desired (there may be more than one, in the CBI analysis we use 3), the relevant A^'''^ are 
combined using equation (55) to form Mfj'^ and M^j, which in turn are used to make Cfj^. After all 
columns for these rows of the covariance matrices are computed, they are written to disk, and this 
process is repeated for the pair of rows corresponding to the next estimator i. When all rows are 
complete, the output file is complete. Note that different binnings of the can be run without 
regridding using the original R, saving significant time. 

If a residual foreground covariance matrix C^^^ is desired, the procedure outlined above is 
repeated in its entirety using the description in §6.4. The spectrum and shape appropriate for 
the source population or foreground emission is used during the gridding and covariance matrix 
construction. Other than these factors, the same gridding as in the CMB and noise estimators 
must be used. 

The output files from cbigridr are then used as input to mlikely. These can be for single 
fields or mosaics, or for combinations of independent fields or mosaics (§7.1). At this point, the 
pre-factors ggrc and Qres for any C^^'^ and C^^^ covariance matrices are chosen and fixed. Relaxation 
to the likelihood maximum is carried out as described in §7, and the resulting bandpowers {qs} 
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and inverse Fisher matrix elements [F ^]bb' are written out. If desired, the bandpower window 
functions W^^^ (§7-2) can be computed if cbigridr was run to produce narrow-bin C^^. The 
component bandpowers q^, gg*^ and q^^^ (§7.3) can also be computed at this time. 

Finally, filtered images using the formalism of §8 can be computed from the estimators, the 
C (at maximum likelihood), and the component covariance matrices. Results from this are shown 
below and in Paper VI. 

The timing for cbigridr depends upon the degree of parallclization as well as the processor 
speed on a given machine, and the number of visibilities griddcd, number of foreground sources, 
and number of bins B for the bandpowers. As an example, the processing of the 14-h mosaic 
field of Paper III (the largest of the datasets) involved gridding 228819 visibilities from 65 separate 
nights of data in 41 fields to 2352 complex estimators. A total of 916 sources were gridded into 
three source covariance matrices. A total of 7 different binnings for were run at this time 
from the same gridding. The execution time using the parallel version of cbigridr was 2^ 40™ 
running on 22 processors on a 32-proccssor Alpha GS320 workstation at the Canadian Institute 
for Theoretical Astrophysics. It then took 3^^' 22™ on the same computer for mlikely to process 
4704 double-precision real estimators in 16 bands, with 3 C^"^^ matrices, one C"^^^ and one C'^. 
This included the time needed to calculate the component bandpowers C^, but not the window 
functions. The speed of this fast gridded method has allowed us to carry out numerous tests on both 
real and simulated dataset, which would not have been possible carrying out maximum likelihood 
(e.g. using even the optimized mlikely) on the 200000-plus visibilities. 



9.1. Method Performance Tests Using Mock CBI Data 

The performance of the method was assessed by applying it to mock CBI datasets. Simulated 
CBI datasets were obtained by replacing the actual visibilities from the data files containing real 
CBI observations of the various fields used in Paper II and Paper III with the response expected for a 
realization of the CMB sky drawn from a representative power spectrum, plus uncorrelated Gaussian 
instrumental noise with the same variance as given by the scatter in the actual CBI visibilities. 
The differencing of the lead and trail fields used in CBI observations was included (e.g. §6.6). This 
mock dataset had the same uv distribution as the real data, and gives an accurate demonstration of 
expected sensitivity levels and the effect of cosmic variance. The power spectrum chosen for these 
simulations was for a model that fit the COBE and BOOMERANG data (Netterfield et al. 2001). 

Figure 2 shows the power spectrum estimation derived following the procedure detailed above. 
The mock datasets were drawn as realizations for the 08h CBI deep field from Paper II. The 
binning of the signal covariance matrix C% was chosen to be uniform in £ with bin width A£ = 500. 
Because a single realization of the sky drawn from the model power spectrum will have individual 
mode powers that deviate from mean given by the power spectrum due to this intrinsic so-called 
"cosmic variance" plus the effect of the thermal instrumental noise, we analyze 387 realizations 
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each taken from a different realization of the sky and a different set of instrumental noise deviates. 

The mean qs for each band B converge to {Cb), which is obtained by integrating the model Q 
over the window functions (e.g. eq.[89]), within the sample uncertainty for the realizations. 
Furthermore, the standard deviation of the qb from the mean for each band agrees with the value 
obtained from the diagonals of the inverse of the Fisher matrix. 

The choice of the i bin size is driven by the trade off between the desired narrow bands 
for localizing features in the power spectrum and the correlations between bins introduced by 
the transform of the primary beam. There is an anti-correlation between adjacent bands seen 
in [F-'^]bb' at the level of -13% to -23% for At = 500 with a single field. We have found that 
correlations up to about —25% give plots of the qs that arc more visually appealing than those made 
with narrower band and higher correlation levels due to the increasing scatter in the bandpowers 
about the mean values. Bins of this size do not achieve the best possible ^-resolution, and thus our 
cosmological parameter runs use finer-binned bands since the correlations are taken into account 
in the analyses. 

The band window functions Wf are shown in the lower panel of Figure 2, and were computed 
using narrow binnings (e.g. eq.[88]) with A£ = 20. The small-scale structures seen in the 

window functions, particularly visible around the peaks, are due to the differencing which introduces 
oscillations (see §6.6). As shown in equation (87), a window function is normalized to sum 
to unity within the given band B, and to sum to zero in the other bands, and thus there must be 
compensatory positive and negative "sidelobes" of the window function outside the band. 

Figure 3 shows the power spectrum derived for a simulated mosaic of 7 x 6 fields separated 
by 20' using the actual CBI 20h mosaics fields from Paper III as a template. This mosaic field 
was chosen as it had incomplete mosaic coverage, and thus would be the most difficult test for the 
method. The binning for C% shown used A£ = 200, which gave adjacent band anti-correlations 
of —13% to —18% in the Fbb'- Again the mean of the 117 realizations converges to the value 
expected within the error bars, showing that there is no bias introduced by the method, even in the 
presence of substantial holes in the mosaic (see Paper III for the mosaic weight map). Furthermore, 
the rms scatter in the realizations converges to the mean of the inverse Fisher error bars, as in the 
single-field case. As in the previous figures, the bandpower window functions are shown in the 
lower panels. 

In Figure 4 are shown three randomly chosen realizations from the ensemble, plotted along 
with the input power spectrum. This shows the level of field-to-field variations that we might expect 
to see in CBI data. There are noticeable deviations from the expected bandpowers in individual 
realizations, particularly at low i where cosmic variance and the highly-correlated bins conspire to 
increase the scatter. These differences are within the expected scatter when bin-bin correlations 
and limited sample size is taken into account, but care must be exercised in interpreting single field 
power spectra. In particular, the acoustic peak structures are obscured by the sample variations. 
However, the average bandpowers for the 3 runs (shown in Figure 4 as open black circles) are better 
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representations of the underlying power spectrum. Although this is not a proper "joint" maximum 
likelihood solution (e.g. §7.1) as is done for the real CBI mosaic fields, the improvement seen using 
the 3-field average leads us to expect that the combination of even 3 mosaic fields damps the single 
field variations sufficiently to begin to see the oscillatory features in the CMB power spectrum. 
While we do not show the equivalent plots of the deep fields from Figure 2, the same behavior 
is seen (with even larger field-to-field fluctuations in the relatively unconstrained first bin, though 
still consistent with the error bars). 

The effect of adding point sources to the mock flelds, and then attempting power spectrum 
extraction, is shown in Figure 5. A set of 200 realizations were made in the same manner as in the 
runs in Figure 2, but the list of point source positions, flux densities and uncertainties, and spectral 
indices from lower frequency used in the analysis in Paper II (the "NVSS" sources) was used to add 
mock sources to the data. The flux density of the sources actually added to the data were perturbed 
using the stated uncertainties as 1-a standard deviations. The errors used were 33% of the flux 
density except for a few of the brighter sources which were put in with 100% uncertainties. We 
then used the methodology described in §6.3 to compute the constraint matrices. The first method 
of correction used was to subtract the (unperturbed) flux densities from the visibilities, and build 
the C^'''^ from A'^'^'^ built using the uncertainties (shown as the red triangles). In addition, we also 
did no subtraction, but built C'^^'^ from A^'^'^ using the full (unperturbed) flux densities (shown as 
the blue squares). This is equivalent to assuming a 100% error on the source flux densities, and 
thus canceling the average source power in those modes. In both cases the factor ggrc = 1 was used. 
The simulations show that both methods are effective, with no discernible bias in the reconstructed 
CMB bandpowers. 

Finally, the production of images using the gridded estimators described in §8 is demonstrated 
in Figure 6. The series of plots show the effect of Wiener filtering using the noise and various signal 
components on an image derived from one of the mock 08h CBI deep field realizations with sources 
from the ensemble shown in Figure 5. The Planck factor weighting of equation (A20) was used 
during gridding to optimize for the thermal CMB spectrum, though in practice this makes little 
difference due to the restricted frequency range of the CBI. The estimators for this realization were 
computed by subtracting the mean values of the source flux densities and putting the standard 
deviations into C^'"'^ with qsrc = 1 (the red triangles in Figure 5). The filtering down- weights the 
high spatial frequency noise seen in the unfiltered image, and effectively separates the CMB and 
source components as shown by comparing panels (c) and (d) to the total signal in panel (b). The 
signal in this realization is dominated by the residuals from two bright point sources that had 100% 
uncertainties put in for their flux densities and thus escaped subtraction. The effectiveness of C®"^^ 
in picking out the sources in the image plane underlines its utility as a constraint matrix in the 
power spectrum estimation. 
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io. Conclusions 

We have outlined a maximum likelihood approach to determining the power spectrum of fluc- 
tuations from interferometric CMB data. This fast gridded method is able to handle the large 
amounts of data produced in large mosaics such as those observed by the CBI. Software encoding 
this algorithm was written, and tested using mock CBI data drawn from a realistic power spectrum. 
The results of the code were shown to converge as expected to the input power spectrum with no 
discernible bias. For small datasets, this code was also tested against independently written soft- 
ware that worked directly on the visibilities. In addition, the pipeline was run with gridding turned 
off as described in §6, again for small test data sets. No bias or significant loss in sensitivity was 
seen in these comparisons. 

This software pipeline was used to analyze the actual CBI data, producing the power spectra 
presented for the deep fields and mosaics in Paper II and Paper III respectively. The output of 
the pipeline also was used as the input for the cosmological parameter analysis in Paper V and the 
investigation of the Sunyaev-Zeldovich Effect in Paper VI. 

This method is of interest for carrying out power spectrum estimation for interferometer ex- 
periments that produce a large number of visibilities but with a significantly smaller number of 
independent samples of the Fourier plane (such as close-packed arrays such as VSA or DASI). The 
CBI pipeline analysis is carried out in two parts, the gridding and covariance matrix construction 
from input uv-FITS files in cbigridr and the maximum likelihood estimation of bandpowers us- 
ing quadratic relaxation in mlikely. The software for the pipeline is available by contacting the 
authors. 

We close by noting that our formalism can be extended to deal with polarization data. In the 
case of CMB polarization, there are as many as six different signal covariance matrices of interest in 
each band, with estimators (or visibilities) for parallel-hand and cross-hand polarization products, 
and thus development of a fast method such as this is critical. In September 2001 polarization 
capable versions of cbigridr and mlikely were written and tested. We describe the method, the 
polarization pipeline, and results in the upcoming paper (Myers et al. 2002, in preparation). 

STM was supported during the early years of the CBI by a Alfred P. Sloan Fellowship from 
1996 to 1999 while at the University of Pennsylvania. Genesis of this method by STM greatly 
benefited by a stay in July 2000 at the ITP in Santa Barbara, supported in part by the National 
Science Foundation under grant PHY99-07949. The National Radio Astronomy Observatory is a 
facility of the National Science Foundation operated under cooperative agreement by Associated 
Universities, Inc. The CBI was funded under NSF grants AST-9413934, AST-9802989 and AST- 
0098734, with contributions by Maxine and Ronald Linde, and Cecil and Sally Drinkward, and 
the strong support of the California Institute of Technology, without which this project would not 
have been possible. In addition, this project has benefited greatly from the computing facilities 
available at CITA, and from discussions with other members of the group at CITA not represented 
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A. Form of the Linear Estimator 

Suppose we were to construct a simple hnear "dirty" mosaic on the sky obtained by a hnear 
combination of the dirty (not deconvolved) images of the individual fields (e.g. Cornwell, Holdaway, 
Sz Uson 1993). In the uv plane, this reduces to summing (integrating) up the visibilities from each 
mosaic "tile" with some weighting function, e.g. 

Ai = J2QikVk (Al) 

k 

where for the time being we ignore the contribution from the complex conjugates of the visibilities 
(see below). For illustrative purposes, let us consider only a single frequency channel and write 
the estimator as a function A(tt), where Aj = A(ui), which in the absence of instrumental noise is 
given by 

A(u) = J (fiv j SxJ^{x,v)Q{u,x,v){V{x,v)) (A2) 

with kernel Q, sky and aperture plane sampling given by and where 

V{x, v) = j <fv' i{v') A{v - v') e^^^^'"" (A3) 

is the visibility at pointing position x and uv locus v from equation (11). In practice, the sampling 
function is just a series of delta functions 

T{x,v) = ^^Uk5^{x - Xk)S^{v - Uk) (A4) 

k 

over the measured visibilities k = 1, . . . , A^vis each with weight uj]^. 

As an ansatz, we let the mosaicing kernel have the form 

Q(w, X, v) = Q{v - u) (A5) 

where Q is the interpolating kernel. Furthermore, let us assume that the ui^plane coverage is the 
same for all mosaic pointings, and thus !F{x,v) is separable 

r{x,v) = F{v)G{x) (A6) 

where F{v) and G{x) are the sampling and weighting in the two domains. Combining these and 
rearranging terms, we get 

A(w) = j (fv' i{v') J (fv F{v) Q{v - u) A{v - v') j (fx G{x) e-2'^^('^-^')-«' (A7) 
= j (fv' i(v')G{u-v') j cPvF{v)Q{v -u)A{v -v') (A8) 
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where in equation (A8) we used the fact that the final right-hand side integral in equation (A7) is 
the Fourier transform G of the mosaic function G. 

For an infinite continuous mosaic, G{v' — u) = 5'^{v' — u) and thus 

A(u) = /(w) J (fvF{v)Q{v -u)A{v -u). (A9) 
If we wish to recover A(ti) = I{u) in this limit, then 

Q{v -u) = A*{v - u) (AlO) 

z{u} 

with normalization 

z{u) = I (fvF{v)A'^{v-u) (All) 



will fulfill our requirements. We have chosen A*{v — u) as the uv kernel as it reproduces the least- 
squares estimate of the sky brightness in the linear mosaic (Cornwell, Holdaway, & Uson 1993). 
Then, equation (A8) becomes 

A{u) = J cPv' i{v') G{u - v') J (fv F{v) A* (v - u) A{v - v') (A12) 

which has a width controlled by the narrower of the width of A^ or the width of G. Thus, by 

widening the mosaic G(x) to a larger area than the beam A, we will fill in the desired information 
inside the A smeared patches in the uihplane. Thus, a properly sampled Af^ mosaic will fill in a 
sub-grid within each uv cell you would have normally had for a single pointing, and thus an 
mosaic consisting of N'^ "images" each is equivalent to a uv super-grid of size (M x N)"^ (e.g. 
Ekers & Rots 1979). 

Note that for a non-continuous mosaic, there will be "aliases" in the uv plane separated by 
the inverse of the mosaic spacing in the sky (Cornwell 1988). Ideally, we would like the separation 
between aliased copies to be larger than the extent of the beam transform, which is satisfied for 
Ax < X/2D which for i:* = 90 cm corresponds to 21^6 at 26.5 GHz and only I6'.l at 35.5 GHz, 
the centers of the extremal CBI bands. The spacing used in the CBI mosaics is a compromise 
between the aliasing limits over the bands and the desire to have a fewer number of pointings on a 
convenient grid. We chose to observe with pointing centers separated by 20', which is sub-optimal 
above 27.5 GHz. However, the effect of aliasing is small, with the overlap point — D 
occurring at the 0.61% point of A at 31 GHz, and the 6.5% point for the highest frequency channel 
at 35.5 GHz. 

We obtain the gridding kernel Qik of equation (Al) corresponding to equation (AlO) by using 
the discrete sampling in equation (A4) 

Qik = — AUuk - Ui) e-2--^-'= (A13) 
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with visibility weights a;^, and normalization factor Zi. The discrete form of the normalization 
derived in equation (All) is 

Zi = ^u;k M (uk -Ui). ( A14) 

k 

Then, 

Ai = - J2^kAUuk-Ui)Vke-^-'^^-^'^ (A15) 
k 

is the weighted sum of visibilities used for the estimators. Note that because V{u) = V*{—u), there 
are also visibilities V^/ for which — tt^.' lies within the support range around Wj, i.e. \A'^,(—Uki—Ui)\ > 
0. Thus, we should add in the complex conjugates 

Ai = -Y,<^k {AUuk - Ui) Vk + Al{-Uk - Ui) Vn e-'^'^'^^-^K (A16) 
k 

To do this, we construct another kernel Qjj. 

Qik = — A-li-Uk - Ui) e-2--»-'= (A17) 

Zi 

which will gather the appropriate V"^*, giving 

^i = Y. i^^fc ^'^ + '^ik Vn (A18) 
k 

for the final form of our linear estimator. 

For estimated visibility variances e^, the optimal weighting factor (in the least-squares estima- 
tion sense) is given by 

ujk = ^ (A19) 

^k 

but may also include factors based on position in the mosaic or frequency channel. For example, 
up until now we have neglected the frequency dependence of the observed visibilities. If we are 
reconstructing an intensity field with a uniform flux density spectrum, then no changes need be 
made. If there is a frequency dependence, such as that for a power-law foreground (e.g. eq.[7]) or the 
thermal spectrum of the CMB (e.g. eq.[6]), then the visibilities should be scaled and weighted by 
the appropriate factor when gridded in order to properly estimate io(wfc) or T{uk respectively. 
For example, for the CMB using equation (6) for the spectrum, we find 

Zi 

^k = (A20) 

In practice for the CBI, the frequency range of the data is not great enough for the spectral weighting 
factor to matter, and we therefore use the default weighting given in equation (A19). This will 
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therefore be slightly non-optimal in the signal-to-noise sense (it will not be the minimum-variance 
estimator) but it will not introduce a bias in the bandpowers. 

The choice of the normalization Zi is somewhat arbitrary, as it only determines the units of 
the Aj and not the correlation properties. However, this can be important if we wish to use the 
estimators to make images using the formalism of §8. For instance, the normalization given in 
equation (A14) has the drawback of diverging in cells where all the are vanishingly small (such 
as the innermost and outermost supported parts of the ut^plane), and will produce images with 
heightened noise on short and long spatial wavelengths. It is therefore more convenient to use the 
alternate normalization 

Zi = ^ujkAl{uk - Ui) (A21) 

k 

which when inserted into equation (A15) will properly normalize the weighted sums of visibilities. 
This will then produce images with the desired units of Janskys per beam (see §8). We therefore 
use equation (A21) for the normalization in our CBI pipeline. 



B. Source Counts and the Residual Covariance Matrix 

We wish to calculate C^^ (cf. eq.[67]) using equation (66) with Uk = = u. If p{a\uQ) is 
independent of flux density, then 

where we have left in the possibility that the upper flux density cutoff will depend on spectral index 
(see below) and set the lower flux density cutoff to zero (the results for realistic power-law counts 
with 7 > — 2 are insensitive to the lower cutoff, but one can easily be included). As an example 
for the calculation of the fluctuation power due to residual sources in the Gaussian limit, consider 
power-law integral source counts 

where N{> S) is the mean number density of sources with flux density greater than S at frequency 
1^0, and a Gaussian spectral index distribution at frequency vq 



p{a\S, uq) = p{a\uQ) = — e ^"'^ . (B3) 

First consider the case where there is a fixed flux density upper cutoff Smax. at the frequency 
where the number counts are defined. The two parts of equation (67) separate easily, where the 
source count part of the integral is 
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For the distribution in equation (B3), the integral over a becomes 



2a poo J ('^-"o) 



iiL da 



1-1 2s2 



'-00 V^TTCT^ 

= e^^^^ff^ (B5) 



where /? = ln(i//i/o), and 



a — an 



^ n n 2 



,2 



a = ao + 2/3(7^ (B6) 

where a is the mean of the extrapolated spectral index distribution which remains a Gaussian, and 
the effective spectral index agff for the spectral component is shifted from the mean spectral index 
of the input distribution ao by the combination of the scatter in the a and the lever arm /? from 
the frequency extrapolation. Putting these together, we get 

^i'^' = -:7T2 (^)'^' ^^^^ 

One can also deal with the case where there is an upper flux density cutoff (Smax imposed at 
a frequency u other than vq where the N{S) distribution is defined. In this case, the flux density 
cutoff in equation (Bl) is 

c (^\ — q {ao-a)p n _ q p-co $ (TiR\ 

'-'maxV,'-'-^ — '-'max "-'max — '-'max V^"/ 

where /3 = ln(i>/^'o), and .Smax is the cutoff -Smax extrapolated to vq using ao- Then, 



Jo 



dS = ^NoSi( ^) e("°-") f (^+2) (B9) 
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"eflf = aQ + l3a1^K^ 
a = ao + 2/3 c7^ K 
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where k gives the modification of the effective spectral index due to the change in the frequency at 
which the cutoff is done. 

One often has an upper fiux density cutoff at two different frequencies. For example, sources 
that are extrapolated to be bright at the CMB observing frequency will have been detected and 
subtracted. If there is a flux density cutoff of Smax imposed at a frequency u as before, but an 
additional upper cutoff of S'^^^ at another frequency u', then there is a critical spectral index 

above which the effective cutoff Smax of equation (B8) changes from that appropriate to i> to that 
at u' (assuming u' > u). Thus, the integral over a in equation (BIO) will be broken into two pieces, 

OA, „ / fU! — 
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J2 = '—NoSl[^^] e2"eff/3 / _I_ e (B13) 

•''acrit V27rc72 

where C^''^ = Ji + J2. The quantities in Ji are as defined in equations (B8) through (Bll), and 
the parameters in J2 are defined in the same way but using the higher frequency z>'. The truncated 
Gaussian integrals are just the integrated probabilities for the normal distribution 

F{x) = r dte-^ = 1 + 1 erf ( (B14) 

with erf(z) the Error Function. Then, 

J^ = —L-NoSU^)'^\'^''-^F{x,,,,) (B15) 
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J2 = --^A^o5oM^)'^'eH«/5[i-F(4^.J] (B16) 



where Xcrit = (acrit - ce)/cra and x'^^^^ = (ocrit - a')/cra- 

As an example, consider the source counts presented in Paper II (§4.3.2), with A^o = 9-2 x 
10^ sr~^ above Sq = 10 mJy at uq = 31 GHz and 7 = —0.875, which gives 

No S^ = 0.715 Jy2 sr"^ (B17) 



7 + 2 



as the raw source power. In the analysis described there, Mason et al. find that a Gaussian 1.4 GHz 
to 31 GHz spectral index distribution with ao = —0.45 and ctq = 0.37 fits the observed data. The 
CBI and OVRO direct measurements have a cutoff of 5max = 25 mJy at 31 GHz (sources brighter 
than this have been subtracted from the CBI data and have residual uncertainties placed in a source 
covariance matrix), and sources above Sjnax = 3.4 mJy at u = 1.4 GHz have already been accounted 
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for in a second source matrix. Therefore, the critical spectral index is acrit = 0.644 from equation 
(B12). For a > acrit, the 31 GHz cutoff holds. Since the cutoff and source distribution are at the 
same frequency as the observations = z/q, there is no extrapolation factor (3 = and the spectral 
index distribution is unchanged (a' = ao). Then, x^^.^ = 2.957 and 1 — -^(a^^rit) ~ ^-^^ ^ 10~^, so 

J2 = -::^NoSl l^^y^' [1-F(4rit)]= 0.003 Jy^sr-i (B18) 

for the flat-spectrum tail of the spectral index integral. The rest of the integral uses the 1.4 GHz 
cutoff, which we extrapolate using the mean spectrum to 31 GHz using equation (B8), 

Sms^ = ^max e-"° ^ = 0.843 mJy, P = -3.098. (B19) 

Because /? = 0, we have to modify the quantities in equation (Bll) by explicitly expanding the 
terms in k, and canceling remaining terms in (3, giving 

a = ao - /3 (t + 2) = 0.027, 2aeS =\fi^ ^^lil + 2)^ = 0.831 (B20) 

which can then be inserted into equation (B15), giving 

Ji = h^NoSl (^Y^ e2"eff/3F(xcnt) = 0.100 Jy^r-i (B21) 

for Xcrit = 1.667, -F(xcrit) ~ 0.952, and thus we expect 

C;:^' = 0.10Jy2sr-i (B22) 

for the amplitude of the residual sources in the CBI fields. In Paper II, it is noted that there is 
a 25% uncertainty on Nq^ and more importantly the power-law slope of the source counts could 
conceivably be as steep as 7 = — 1. Taking the extreme of 7 = —1, we get 

Cr = 0.15Jy2sr-i (B23) 

using the above procedure. Wc thus conservatively estimate a 50% uncertainty on the value of 
derived in this manner. Note that in Paper II we actually use the value of C^^® = 0.08 Jy^ sr~^ 
derived using a Monte-Carlo procedure emulating the integrals in equation (B13) but using the 
actual observed distribution of source flux densities and spectral indices. The agreement between 
these two estimates shows the efficacy of this procedure in practice. 

C. Comparison With the HM Method 

Recently, Hobson &; Maisinger (2002, HM) have independently proposed a binned U'W-plane 

method that is somewhat similar to ours, though it is more directly related to the "optimal maps" 
of Bond &: Crittenden (2001). HM use a gathering mapping H (M in their notation) 



V = Yls + e 



(CI) 
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rather than our scattering kernel Q of equation (21). In the HM method, the vector s can be 
thought of as a set of ideal pixels in the w^^plane. They show that the likelihood depends upon 
binned visibilities v and noise n 

n = (HtE-iH)-^HtE-^e (C2) 

where 

= (nnt) = (HtE-iH)-i. (C3) 

The HM kernel Hjk is chosen to equal 1 if the Uj of visibility Vj lies in cell k, though other more 
complicated kernels could be imagined. The HM method will also give a calculational speedup 
through the reduction in number of independent gridded estimators, and the use of the method is 
demonstrated using simulated VSA data in their paper. 
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Fig. 1. — Graphical representation of the regions of support in the aperture plane for the correlation 
between visibilities on short baselines B < \/2D. Note that both Vi and its conjugate V* = V{—Ui) 
have overlapping support for visibility Vj, and this must be taken into account in computing the 
covariance matrix element. 
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Fig. 2. — Results of the gridded method plus quadratic relaxation for 387 mock CBI 08h deep-field 
datasets (upper panel), with each mock observation drawn from an independent realization of the 
sky given the model power spectrum (shown as the dashed curve) and the instrumental noise with 
the appropriate rms. The points (black circles) are placed at the band centers, at the mean of the 
reconstructed bandpowers with the (red) errorbars given by the scatter among the realizations. The 
(blue) errorbars to the right of the points show the average of the inverse Fisher matrix diagonals. 
The histograms show the width of each band and the level expected by integrating the model Ci 
over the window functions Wf which are shown in the lower panel. The mean of the realizations 
converges to the expected value within the Poisson uncertainty, taking into account the correlations 
between adjacent bands. 



- 44 - 



6000 



:l 4000 



2000 



o 
o 
o 




3000 



Fig. 3. — Upper panel shows the results of the gridded method plus quadratic relaxation for 117 
mock CBI (7x6 field) mosaic datasets with a CDM-based power spectrum (the dashed curve). 
As in Figure 2, the points (black circles) are centered in each bin in i with (red) errorbars giving 
rms scatter of about the mean for the bandpowers from the processed realizations, with the (blue) 
error bar from average inverse Fisher diagonals to the right of the points. The window functions 
are plotted in the lower panel. 
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Fig. 4. — Three randomly selected realizations from the mosaic field simulations shown in Figure 3 
are shown plotted against the model power spectrum (solid black curve). The three lines for the 
bandpowers reconstructed from the three realizations correspond to the bandpowers (central lines) 
and the ±lc7 excursions using the inverse Fisher errorbars. The scatter in bandpowers between 
realizations is within the expected range. Also shown is the unweighted scalar average of the 
bandpowers for the 3 realizations, which is an approximation to a true joint likelihood solution. 
The average is a better fit to the model, as is expected. 
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Fig. 5. — Results using mock deep-field datasets including foreground point sources based on the 
actual list used in the CBI data are shown (upper panel) along with the window functions (lower 
panel). The input power spectrum and expected bandpowers are as in Figure 2. The (green) stars 
show the average for 200 realizations where no source subtraction or projection was done, with the 
powers divided by a factor of 2 to fit on the plot. The expected increase with is seen, along with 
a fallofF in the last bin due to the source frequency spectrum. The points with errorbars at the 
center of each bin (red triangles) were computed from 200 realizations processed with subtraction 
of the mean flux density from the visibilities and construction of C'^'''^ using the uncertainties, while 
the points with errorbars to the right of these (blue squares) are from 200 realizations where no 
source subtraction was done, but we built C'^'''^ using the full (unperturbed) flux densities which 
projects out the sources with only a slight increase in noise. Despite the large power from sources 
at high our method successfully removes the foreground power from the spectrum with no sign 
of bias. 
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Fig. 6. — Images reconstructed from the gridded estimators using the formahsm of §8. Data is 
for one of the mock 08h deep-field reahzations with sources used in Figure 5. Shown are: Panel 
(a) upper left — image computed without any filtering; (b) upper right — the image derived by 
setting C-^ = C'^'''^ + qb the sum of the signal terms; (c) lower left — the image using 
C-^ = J2b 1b ^% for the CMB only; (d) lower right — image for C-^ = C**'''^ to pick out the point 
sources. The filter clearly dampens the noise, and separates the CMB and source components. The 
residuals from several bright sources dominate the signal in all but the CMB-filtered image (the 
brightness scale in that image is enhanced). The white dashed circle shows the 45^2 FWHM of the 
CBI at 31 GHz. The attenuation of the signal brightness due to the square of the primary beam is 
clearly seen. 



